Prediction model of lncRNA-encoded short peptides based on representation learning and deep forest
Tengqi JI, Jun MENG, Siyuan ZHAO, Hehuan HU
Journal of Computer Applications    2021, 41 (12): 3614-3619.   DOI: 10.11772/j.issn.1001-9081.2021061082

Small Open Reading Frames (sORFs) in long non-coding RNA (lncRNA) can encode short peptides of no more than 100 amino acids. In short peptide prediction research, the features of sORFs in lncRNA are not distinctive and highly reliable data are scarce. To address these problems, a Deep Forest (DF) model based on representation learning was proposed. Firstly, a conventional lncRNA feature extraction method was used to encode the sORFs. Secondly, an AutoEncoder (AE) was used to perform representation learning and obtain an efficient representation of the input data. Finally, a DF model was trained to predict the short peptides encoded by lncRNA. Experimental results show that the model achieves an accuracy of 92.08% on the Arabidopsis thaliana dataset, which is higher than those of traditional machine learning models, deep learning models and combined models, and that the model is more stable. In addition, the prediction accuracy of the method reaches 78.16% and 74.92% on the Glycine max and Zea mays datasets respectively, verifying the good generalization ability of the proposed model.
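
The first step (locating sORFs and encoding them as feature vectors) can be illustrated with a minimal Python sketch. The abstract does not say which conventional features were used, so the k-mer frequency encoding here is an assumption chosen for illustration, as are the function names `find_sorfs` and `kmer_frequencies`, the forward-strand-only scan, and the ATG start codon. Only the limit of 100 amino acids comes from the abstract. The AutoEncoder and Deep Forest stages are not shown.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}
MAX_PEPTIDE_LEN = 100  # amino acids, the sORF limit stated in the abstract


def find_sorfs(seq, max_len=MAX_PEPTIDE_LEN):
    """Return forward-strand ORFs (start codon through stop codon) whose
    encoded peptide has at most max_len amino acids.

    Assumption: ORFs start at ATG; nested in-frame ORFs are all reported.
    """
    seq = seq.upper().replace("U", "T")
    sorfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from the start codon to the first stop codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                n_amino_acids = (pos - start) // 3  # stop codon not counted
                if n_amino_acids <= max_len:
                    sorfs.append(seq[start:pos + 3])
                break
    return sorfs


def kmer_frequencies(seq, k=3):
    """Encode a sequence as normalized overlapping k-mer frequencies.

    Hypothetical stand-in for the paper's unspecified feature extraction.
    The vector has 4**k entries in fixed lexicographic order.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip k-mers containing ambiguous bases
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


if __name__ == "__main__":
    transcript = "CCATGAAATTTTAGGG"  # toy sequence, not real data
    for orf in find_sorfs(transcript):
        features = kmer_frequencies(orf)
        print(orf, len(features))
```

Fixed-length vectors of this kind could then be passed to a representation-learning step and a classifier, as the abstract describes.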
